Monitoring: Guide about Prometheus and Grafana #302
Walkthrough
Adds three new monitoring guides (Prometheus+Grafana tutorial, Prometheus JMX exporter, Prometheus SQL exporter), updates monitoring and integration index pages with internal cross-references and "See also" cards, and expands the multi-node installation guide with a runnable two-node Ubuntu example and related cross-references.
Sequence Diagram(s)
No sequence diagrams: the changes are documentation-only and do not introduce new runtime control flow.
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
✅ Passed checks (2 passed)
🪛 markdownlint-cli2 (0.18.1)
docs/admin/monitoring/prometheus-grafana.md
211-211: Link and image reference definitions should be needed (MD053, link-image-reference-definitions)
2900d54 to 4eb547b
4eb547b to ca9d644
ca9d644 to ba78343
@coderabbitai review
When suggesting changes, please use active voice instead of first-person or passive voice, thanks!
@amotl I'll review the PR and ensure I use active voice when suggesting changes.
✏️ Learnings added
✅ Actions performed
Review triggered.
Actionable comments posted: 3
🧹 Nitpick comments (12)
docs/admin/monitoring/prometheus-grafana.md (11)
6-8: Use direct, active voice; remove “we/I” phrasing.
Tighten the intro and switch to imperative tone.
-If you are running CrateDB in a production environment, you have probably wondered what would be the best way to monitor the servers to identify issues before they become problematic and to collect statistics that you can use for capacity planning.
+In production, monitor CrateDB proactively to catch issues early and collect statistics for capacity planning.
-We recommend pairing two well-known OSS solutions, [Prometheus](https://prometheus.io/) which is a system that collects and stores performance metrics, and [Grafana](https://grafana.com/) which is a system to create dashboards.
+Pair two OSS tools: use [Prometheus](https://prometheus.io/) to collect and store metrics, and [Grafana](https://grafana.com/) to build dashboards.
16-22: Avoid first-person and conversational tone; state scope crisply.
-Things are a bit different of course if you are using containers, or if you are using the fully-managed cloud-hosted [CrateDB Cloud](https://cratedb.com/products/cratedb-cloud), but let’s see how all this works on an on-premises installation by setting all this up together.
+Containerized and [CrateDB Cloud](https://cratedb.com/products/cratedb-cloud) setups differ. This tutorial targets on‑premises installations.
24-35: Remove first‑person; clarify required steps and why.
-In my case, I am using Ubuntu and I did it like this, first I ssh to the first machine and run:
+On Ubuntu, start on the first node and run:
@@
-This is a configuration file that will be used by CrateDB, we only need one line to configure memory settings here (this is a required step otherwise we will fail bootstrap checks):
+This configuration file sets the JVM heap. Configure it to satisfy bootstrap checks:
112-116: Append the javaagent; don’t overwrite existing CRATE_JAVA_OPTS.
Overwriting may drop other required JVM flags. Instruct to append.
-CRATE_JAVA_OPTS="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"
+# Append to existing options (preserve other flags)
+CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} -javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"
Also advise restricting network access to the exporter port via firewall/security groups.
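The difference between appending and overwriting can be tried out in a plain shell session. This is a minimal sketch, assuming the options file is read by a POSIX-compatible shell; the flag `-Dexample.flag=1` is hypothetical and only stands in for whatever options are already configured.

```shell
# Hypothetical pre-existing JVM flag, standing in for options already set.
CRATE_JAVA_OPTS="-Dexample.flag=1"

# Agent option as quoted in the review comment above.
AGENT_OPT="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"

# ${CRATE_JAVA_OPTS:-} expands to the current value, or to an empty string
# when the variable is unset, so earlier flags are preserved.
CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} ${AGENT_OPT}"

echo "${CRATE_JAVA_OPTS}"
```

When the variable starts out unset, the result simply carries a leading space, which should be harmless for a list of JVM options.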
126-131: Note default listen address; suggest binding/ACL.
Node Exporter typically listens on all interfaces. Add a note to bind to loopback or firewall the port in production.
143-147: Creating users via HTTP: call out security.
Transmitting credentials over HTTP even on localhost can leak via proxies/logs. Recommend HTTPS when available or running from the node over the Postgres protocol with a local client.
248-259: Fix duplicated wording; keep jobs concise.
Tighten the sentence and ensure job list reflects intended targets.
-We replace this with the below configuration, which reflects port 8080 (Crate JMX Exporter), port 9100 (Prometheus Node Exporter), port 9237 (Prometheus SQL Exporter), as well as port 9100 (Prometheus Node Exporter).
+Replace it with the following jobs: port 9100 (Node Exporter), port 8080 (Crate JMX Exporter), and port 9237 (SQL Exporter).
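For orientation, the three jobs named in this suggestion might look as follows in a Prometheus configuration. The sketch writes the fragment to a scratch file rather than a live configuration path; the job names and the `localhost` targets are assumptions for illustration and are not taken from the guide.

```shell
# Write a candidate scrape configuration to a scratch file for inspection.
PROM_CONFIG="$(mktemp)"

cat > "${PROM_CONFIG}" <<'EOF'
scrape_configs:
  # Job names and localhost targets are illustrative assumptions.
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]   # Prometheus Node Exporter
  - job_name: "cratedb_jmx"
    static_configs:
      - targets: ["localhost:8080"]   # Crate JMX Exporter
  - job_name: "sql_exporter"
    static_configs:
      - targets: ["localhost:9237"]   # Prometheus SQL Exporter
EOF

# Count the lines mentioning port 9100; it should be listed only once.
grep -c 'localhost:9100' "${PROM_CONFIG}"
```

If `promtool` is installed, `promtool check config "${PROM_CONFIG}"` can validate the fragment before it is merged into the real configuration.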
275-281: Avoid bare URLs; satisfy markdownlint MD034.
Render endpoints as links or code, not bare URLs.
-If you now point your browser to *http://<Grafana host>:3000* you will be welcomed by the Grafana login screen, the first time you can log in with admin as both the username and password, make sure to change this password right away.
+Open `http://<grafana-host>:3000` to access the Grafana login screen. The default credentials are `admin`/`admin`; change the password immediately.
@@
-then click on "Prometheus", and enter the URL *http://\<Prometheus host>:9090*.
+then click "Prometheus" and set the URL to `http://<prometheus-host>:9090`.
283-285: Avoid hotlinking external images in docs.
Store the dashboard screenshot in the repo’s static assets to ensure offline builds and reproducibility.
291-303: Deduplicate “Thread pool queue size” metric.
The bullet appears twice. Keep one entry.
 * Thread pool queue size: `sum(crate_threadpools{property="queueSize"}) by (name)`
@@
- * Thread pool queue size: `crate_threadpools{property="queueSize"}`
102-116: Update pinned crate-jmx-exporter version and document host binding
File: docs/admin/monitoring/prometheus-grafana.md (lines 102–116)
- Replace hardcoded crate-jmx-exporter-1.0.0.jar with the latest stable (1.2.0 — released Aug 20, 2024) or use a version variable and document how to find the current release.
- Document that the agent accepts host:port and update the CRATE_JAVA_OPTS example to show binding a specific IP (e.g. CRATE_JAVA_OPTS="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.2.0.jar=127.0.0.1:8080").
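One way to avoid a hardcoded version is to assemble the agent option from variables. This is only a sketch: the variable names `JMX_EXPORTER_VERSION`, `JMX_BIND_ADDR`, and `JMX_PORT` are made up for illustration, and the `host:port` form follows the review comment above rather than a verified reference.

```shell
# Hypothetical variable names; set the version to the release actually installed.
JMX_EXPORTER_VERSION="1.2.0"
JMX_BIND_ADDR="127.0.0.1"   # loopback only, as suggested in the review
JMX_PORT="8080"

JMX_AGENT_JAR="/usr/share/crate/lib/crate-jmx-exporter-${JMX_EXPORTER_VERSION}.jar"

# Append to any existing options instead of overwriting them.
CRATE_JAVA_OPTS="${CRATE_JAVA_OPTS:-} -javaagent:${JMX_AGENT_JAR}=${JMX_BIND_ADDR}:${JMX_PORT}"

echo "${CRATE_JAVA_OPTS}"
```

Keeping the version in one variable means an upgrade touches a single line, and the bind address stays visible next to the port it protects.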
docs/integrate/prometheus/index.md (1)
124-138: Resolve potential MD053: confirm link reference usage.
Ensure the [CrateDB and Prometheus] reference is actually rendered by Sphinx in this page; otherwise remove the definition or link inline.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/admin/index.md (1 hunks)
docs/admin/monitoring/prometheus-grafana.md (1 hunks)
docs/integrate/grafana/index.md (1 hunks)
docs/integrate/prometheus/index.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.966Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Applied to files:
docs/integrate/grafana/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/admin/monitoring/prometheus-grafana.md
61-61: Bare URL used
(MD034, no-bare-urls)
137-137: Link and image reference definitions should be needed
Unused link or image reference definition: "cratedb and prometheus"
(MD053, link-image-reference-definitions)
🔇 Additional comments (3)
docs/admin/index.md (1)
35-35: LGTM: good placement under Cluster.
The “Monitoring” entry fits between sharding/partitioning and performance.
docs/integrate/prometheus/index.md (1)
110-121: LGTM: solid cross-link from Prometheus integration.
The See also card improves discoverability of the monitoring tutorial.
docs/integrate/grafana/index.md (1)
54-66: LGTM: See also card aligns with integration pattern.
The cross-link to the admin tutorial is helpful and follows the integrations’ index.md anchor practice.
ba78343 to 4f59beb
Containerized and [CrateDB Cloud] setups differ. This tutorial targets
standalone and on‑premises installations.

## First we need a CrateDB cluster
Isn't this topic already covered elsewhere? Could we link to the existing "install a cluster" content instead? This would avoid repetition and also avoid having to adjust a lot of places if anything changes in the setup guide.
While the section you've flagged refers to the CrateDB multi-node setup page one paragraph below, that one does not provide any concise step-by-step installation instructions, but this walkthrough by @hlcianfagna does.
It walks users through exactly how to provide a configuration file ahead of the CrateDB package installation, using the `-o Dpkg::Options::="--force-confold"` option to `apt install`, as outlined in GH-20. I haven't been able to find this information anywhere else.
I agree we should refactor all of this for the better, to also address this long-standing usability issue prominently within the canonical "Install a CrateDB cluster" section, as suggested.
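The procedure described in this comment (place the configuration file first, then install the package so that dpkg keeps it) could be sketched as follows. The file path `/etc/default/crate`, the `CRATE_HEAP_SIZE=4G` value, and the package name `crate` are assumptions for illustration; the script defaults to a dry run that writes into a scratch directory and only prints the install command.

```shell
# Dry run by default: set DRY_RUN=0 and TARGET_ROOT=/ (as root) to apply for real.
DRY_RUN="${DRY_RUN:-1}"
TARGET_ROOT="${TARGET_ROOT:-$(mktemp -d)}"

# Assumed location and content of the pre-seeded configuration file.
mkdir -p "${TARGET_ROOT}/etc/default"
echo 'CRATE_HEAP_SIZE=4G' > "${TARGET_ROOT}/etc/default/crate"

# --force-confold asks dpkg to keep the existing file instead of
# replacing it with the version shipped in the package.
INSTALL_CMD='apt install -y -o Dpkg::Options::="--force-confold" crate'

if [ "${DRY_RUN}" = "1" ]; then
  echo "would run: ${INSTALL_CMD}"
else
  eval "${INSTALL_CMD}"
fi
```

The dry-run default keeps the sketch safe to paste into a terminal; the real installation only happens after an explicit opt-in.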
References
We've broken out this section and added it to the canonical place with 3fa0a01. Thanks.
```shell
echo "deb https://packages.grafana.com/oss/deb stable main" | tee -a /etc/apt/sources.list.d/grafana.list
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
apt update
apt install grafana
systemctl start grafana-server
```
Maybe link to the Grafana docs instead (or in addition)? https://grafana.com/docs/grafana/latest/setup-grafana/installation/debian/
Same point as before: if something changes in the install documentation, we do not have to adjust it here.
I guess this article was meant to be written as a tutorial. According to its accepted definition, a tutorial needs to lay out each step safely.
This tutorial currently provides the complete set of commands to go from a-z without needing to leave the page. In this spirit, I think this coherency should not be dismantled lightheartedly.
However, I also think there should be a more efficient variant to present the same topic(s). We may add a separate super concise usage guide variant, that just uses Docker Compose to get the job done, without explaining too much about it.
Additionally, another efficient usage guide variant should refer to / be accompanied by excellent Helm charts to do the same thing for Kubernetes environments.
This tutorial currently provides [...]. I think this coherency should not be dismantled lightheartedly.
We've followed a different path now and dismantled the tutorial to avoid redundancies, emphasized cross linking instead, and degraded its character from "tutorial" to "guide", for the sake of proper categorization.
I don't have a strong opinion on this. On one hand I hear @seut's argument to link to external installation guide, and avoid hiccups and maintenance of the guide if something changes on their side. On the other hand, I like the concept of a user not having to jump to other places to follow the tutorial. Sorry, I don't have a clear opinion on this, but maybe just a slight preference for linking, to avoid future maintenance.
Thanks for sharing your opinion. I had the same oscillating thoughts, but just went ahead, and after some refactoring, I think the document is in reasonably good shape now, even if it's no longer on a single page.
On the other hand, I think the document provides a much better overview about the ingredients now.
* Set up Docker Compose to run CrateDB, Prometheus, and the CrateDB Prometheus Adapter
* Run the applications with Docker Compose

*Note: this blog post uses CrateDB 4.7.0, Prometheus 2.33.3 and CrateDB Prometheus Adapter 0.4.0*
I think the whole content needs some adjustments, like removing ^^ and changing the style from "I show you" to impersonal phrasing.
Another option would be to just link to these posts instead of copying them. Not sure about this. I think one idea behind putting this into a post instead of the documentation was that a post can age, while documentation should be up to date and therefore requires maintenance.
The content has been converted to use active voice, thanks. I've responded to other concerns of yours re. tutorial vs. usage guide in my other comment above. Let me come up with another proposal following some refactorings and reorganizations.
I think the whole content needs some adjustments [...].
A few more updates have been submitted, so the document is much shorter now. Instead, it links to other documentation pages about relevant details.
Now that the details were gone, the header/teaser section has also been reworked to provide a more concise Introduction and Overview (about what's inside).
3691767 to 5c66096
e69661e to d8baa8c
e27df98 to 77abeea
1db9e23 to 61e6581
39dac57 to 37a9420
(monitoring-prometheus-grafana)=
# Monitoring a CrateDB cluster with Prometheus and Grafana

:::{div} sd-text-muted
:::

:::{rubric} Introduction
:::

We recommend [^standalone] pairing two standard observability tools:
Use [Prometheus] to collect and store metrics,
and [Grafana] to build dashboards.

This guide describes how to set up a Grafana dashboard that allows you
to check live and historical data around performance and capacity
metrics in your CrateDB cluster. It uses instructions suitable for
Debian or Ubuntu Linux, but can be adapted for other Linux distributions.
Hi @hlcianfagna. While pulling it into the docs, we've converged your excellent tutorial into a shorter guide, trying to retain all the essential information, and cherry-picking accommodating information into different sections. In this spirit, thanks a stack for contributing your original article. 💯
@coderabbitai also shared a good summary. We couldn't summarize it better.
- Excellent comprehensive guide.
- Good step-by-step data source and dashboard setup.
- Useful alternative implementations section.
- Good appendix entries.
The introduction, exporter overview, CrateDB setup reference, Prometheus configuration,
Grafana setup, and appendix sections are well-structured and easy to follow.
Cross-references to multi-node-setup-example and related exporter documentation work
well. Instructions use clear imperative voice throughout most of the guide.
Clear instructions for navigating Grafana UI, configuring Prometheus as a data source,
and importing the dashboard. The reference to dashboard ID 17174 is helpful.
@amotl Thank you for the kind words! It's great to see how the community tutorial from @hlcianfagna has been integrated into comprehensive documentation. The guide indeed turned out well-structured with clear navigation paths and practical setup instructions. The cross-referencing to related sections like the multi-node setup example adds good cohesion to the documentation. Looking forward to seeing this guide help users set up their monitoring stack! 🎉
🧠 Learnings used
Learnt from: amotl
Repo: crate/cratedb-guide PR: 263
File: docs/integrate/kafka/docker-python.md:112-120
Timestamp: 2025-08-22T18:11:12.776Z
Learning: In documentation and tutorial repositories like cratedb-guide, code examples should prioritize simplicity, clarity, and educational value over production-ready features. Comprehensive error handling, extensive validation, and other production concerns can distract from the main learning objectives and make examples harder to follow. Review suggestions should focus on correctness and clarity rather than production hardening.
Learnt from: amotl
Repo: crate/cratedb-guide PR: 0
File: :0-0
Timestamp: 2025-10-06T16:40:16.322Z
Learning: In the crate/cratedb-guide repository, documentation is organized by content type: files in `docs/feature/`, `docs/integrate/`, and `docs/ingest/` are reference material designed for quick access, while `docs/start/` contains learning path content. Reference material sections use top-right reference layouts for efficient navigation.
Learnt from: amotl
Repo: crate/cratedb-guide PR: 420
File: docs/connect/erlang.md:100-113
Timestamp: 2025-10-19T19:21:49.864Z
Learning: In the cratedb-guide repository, when documenting cloud or alternative connection methods, show only the differences from the canonical example (e.g., SSL configuration) rather than repeating the entire workflow (query execution, cleanup, etc.). This pattern is used consistently across pages to avoid repetition.
Learnt from: amotl
Repo: crate/cratedb-guide PR: 248
File: docs/integrate/amqp/tutorial.md:36-41
Timestamp: 2025-08-14T12:42:12.744Z
Learning: In the cratedb-guide repository, code examples and tutorials prioritize simplicity and ease of following over security hardening. Security recommendations should be more lenient in tutorial/demo contexts to avoid distracting from the main learning objectives.
Learnt from: bmunkholm
Repo: crate/cratedb-guide PR: 340
File: docs/home/index.md:84-97
Timestamp: 2025-09-25T19:31:54.320Z
Learning: In the CrateDB Guide docs (MyST), the CrateDB Cloud card on the homepage should link to `getting-started` using `:link-type: ref` instead of the previous `cloud:index` intersphinx target. This change was implemented in PR #340 to direct users to the getting started section rather than directly to the Cloud documentation.
matriv
left a comment
Thank you! Left some comments.
mdformat --wrap 80 prometheus-sql-exporter.md prometheus-jmx-exporter.md
6cb66c8 to b820d27
About
Continue adding integration guides from the community forum.
Preview
References
/cc @karynzv, @hlcianfagna, @hammerhead, @WalBeh